Accurate Prediction of Protein Functional Class From Sequence in the Mycobacterium tuberculosis and Escherichia coli Genomes Using Data Mining
Authors
Abstract
The analysis of genomics data needs to become as automated as its generation. Here we present a novel data-mining approach to predicting protein functional class from sequence. This method is based on a combination of inductive logic programming clustering and rule learning. We demonstrate the effectiveness of this approach on the M. tuberculosis and E. coli genomes, and identify biologically interpretable rules which predict protein functional class using only information available from the sequence. These rules predict 65% of the ORFs with no assigned function in M. tuberculosis and 24% of those in E. coli, with an estimated accuracy of 60-80% (depending on the level of functional assignment). The rules are founded on a combination of detection of remote homology, convergent evolution and horizontal gene transfer. We identify rules that predict protein functional class even in the absence of detectable sequence or structural homology. These rules give insight into the evolutionary history of M. tuberculosis and E. coli.
Similar resources
Accurate Prediction of Protein Functional Class from Sequence in the M. tuberculosis and E. coli Genomes using Data Mining
The analysis of genomics data needs to become as automated as its generation. Here we present a novel data-mining approach to predicting protein functional class from sequence. This method is based on a combination of inductive logic programming clustering and rule learning. We demonstrate the effectiveness of this approach on the M. tu...
Molecular Cloning, Expression and Purification of Protein TB10.4 Secreted by Mycobacterium Tuberculosis
Objective(s): Tuberculosis (TB) is the leading cause of mortality among the infectious diseases, especially in developing countries. One of the main goals in tuberculosis research is to identify antigens able to induce cellular and/or humoral immunity, for use in diagnostic reagents or vaccine design. The aim of this study was to clone and express the TB10.4 prot...
Production of MPT-64 recombinant protein from virulent strain of Mycobacterium bovis
Tuberculosis (TB) is a zoonotic infectious disease common to humans and animals, caused by a rod-shaped, acid-fast bacterium called Mycobacterium bovis. Rapid and sensitive detection is a great challenge for TB diagnosis. The virulent strains of Mycobacterium tuberculosis complex (MTBC) have 16 different regions of difference (RD) in their genome which encode some important a...
Mycobacterium tuberculosis HspX/EsxS Fusion Protein: Gene Cloning, Protein Expression, and Purification in Escherichia coli
Background: The purpose of this study was to clone, express, and purify a novel multidomain fusion protein of Mycobacterium tuberculosis (Mtb) in a prokaryotic system. Methods: An hspX/esxS gene construct was synthesized and ligated into a pGH plasmid, E. coli TOP10 cells were transformed, and the vector was purified. The vector containing the construct and pET-21b (+) plasmid were digested ...
Acquired Antimicrobial Resistance Genes of Escherichia coli Obtained from Nigeria: In silico Genome Analysis
Background: Antimicrobial resistance is a global problem with enormous public health and economic impact. This study was carried out to get an overview of acquired antimicrobial resistance gene sequences in the genomes of Escherichia coli isolated from different food sources and the environment in Nigeria. Methods: To determine the acquired antimicrobial-resistant genes prevalence, genome asse...
Journal: Yeast (Chichester, England)
Volume: 17
Publication date: 2000